Conversation

@turmelclem turmelclem commented Oct 13, 2025

Content

This PR reworks how the mithril-aggregator retrieves the network configuration parameters, in preparation for their decentralization.

Details

Mithril-aggregator

Changes on both leader and follower:

  • Rework EpochService to use the MithrilNetworkConfigurationProvider system from internal/mithril-protocol-config
    • inform_epoch now fetches its current/next/registration AggregatorEpochSettings from the configuration provider
    • Rework how future epoch settings are inserted in the database: instead of inserting them at an offset of +2 in an update_epoch_settings step that the state machine ran after inform_new_epoch, they are now inserted within inform_epoch, right after retrieving the parameters from the configuration provider, at an offset of +1. The parameters are therefore stored at the last moment, right before their use, instead of ahead of time.
  • Epoch setting store: use an insert or ignore instead of an insert or replace when storing the epoch settings. Stored epoch settings are now considered final, which is why their insertion was moved to right before their use.
  • Rework handle_discrepancies_at_startup with the aim of making it ready for decentralization:
    • Source its data from the MithrilNetworkConfigurationProvider instead of the aggregator configuration
    • The previous change, combined with the store change, means that data is now registered for the work epoch window (-1, 0, +1) instead of from -1 to +2
    • Run it later in the dependency injection: after building the ServeCommandDependenciesContainer instead of after building the epoch setting store. This change limits its call to the serve command; previously, other commands could call it even if they did not need it at all.
  • Configuration:
    • protocol_parameters and cardano_transactions_signing_config are now options, but still mandatory for a leader aggregator (the missing configuration error is now handled manually instead of automatically by the config crate)
    • Add optional preload_security_parameter with a default value of 2160. Used to configure the transactions preloader instead of fetching the security parameter in the cardano_transactions_signing_config configuration.
  • Single signature authenticator: log the inner error when authentication fails; previously no context was available, so we could not know why an authentication failed
  • Test:
    • strengthen create_certificate_follower integration test by making it check that the follower aggregator works without configured protocol_parameters
    • rework and simplify the test tooling attached to the ServeCommandDependenciesContainer:
      • it no longer stores data in the epoch_settings table, only in the signer and signer_registration tables. This means that the epoch_settings must have been filled beforehand (either by running handle discrepancies or manually).
      • update usages, notably in the certifier service tests
      • remove now unused methods
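
As an illustration of the configuration change listed above, here is a minimal sketch of optional parameters that stay mandatory for a leader, with the missing-configuration error raised by hand. All type, field, and method names below (AggregatorConfig, leader_aggregator_endpoint, ConfigError, validate, preload_security_parameter_or_default) are assumptions made for the example and are not taken from the PR; only the default value of 2160 comes from the description.

```rust
use std::fmt;

/// Simplified, hypothetical stand-in for the protocol parameters.
#[derive(Debug, Clone, PartialEq)]
pub struct ProtocolParameters {
    pub k: u64,
    pub m: u64,
    pub phi_f: f64,
}

/// Simplified, hypothetical stand-in for the transactions signing config.
#[derive(Debug, Clone, PartialEq)]
pub struct CardanoTransactionsSigningConfig {
    pub security_parameter: u64,
    pub step: u64,
}

/// Hypothetical, trimmed-down aggregator configuration.
#[derive(Debug, Clone, Default)]
pub struct AggregatorConfig {
    /// Assumption: a follower is identified by having a leader endpoint configured.
    pub leader_aggregator_endpoint: Option<String>,
    pub protocol_parameters: Option<ProtocolParameters>,
    pub cardano_transactions_signing_config: Option<CardanoTransactionsSigningConfig>,
    pub preload_security_parameter: Option<u64>,
}

/// Error raised manually when a leader lacks a parameter it needs.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    MissingForLeader(&'static str),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConfigError::MissingForLeader(name) => {
                write!(f, "missing configuration for a leader aggregator: {}", name)
            }
        }
    }
}

impl std::error::Error for ConfigError {}

impl AggregatorConfig {
    pub fn is_follower(&self) -> bool {
        self.leader_aggregator_endpoint.is_some()
    }

    /// A follower may omit both parameters; a leader must provide them.
    pub fn validate(&self) -> Result<(), ConfigError> {
        if self.is_follower() {
            return Ok(());
        }
        if self.protocol_parameters.is_none() {
            return Err(ConfigError::MissingForLeader("protocol_parameters"));
        }
        if self.cardano_transactions_signing_config.is_none() {
            return Err(ConfigError::MissingForLeader(
                "cardano_transactions_signing_config",
            ));
        }
        Ok(())
    }

    /// Falls back to 2160 when the option is not set.
    pub fn preload_security_parameter_or_default(&self) -> u64 {
        self.preload_security_parameter.unwrap_or(2160)
    }
}
```

With this shape, a follower configuration that sets only its leader endpoint passes validation, while a leader configuration missing protocol_parameters gets an explicit error.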

Leader aggregator specific:

  • Add LocalMithrilNetworkConfigurationProvider: a MithrilNetworkConfigurationProvider that first fetches its data from the epoch_settings table in the sqlite database and, if an entry is missing for an epoch, falls back to the usual configuration parameters (protocol_parameters and cardano_transactions_signing_config)
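
The lookup order described above (stored epoch settings first, configured values as a fallback) can be sketched as follows. This is a simplified illustration, not the actual provider: LocalProviderSketch, EpochSettings, and settings_for are hypothetical names, and an in-memory map stands in for the epoch_settings table.

```rust
use std::collections::HashMap;

pub type Epoch = u64;

/// Simplified, hypothetical stand-in for the per-epoch settings.
#[derive(Debug, Clone, PartialEq)]
pub struct EpochSettings {
    pub k: u64,
    pub m: u64,
}

/// Sketch of a local provider: `stored` plays the role of the
/// epoch_settings table, `configured` the role of the node configuration.
pub struct LocalProviderSketch {
    stored: HashMap<Epoch, EpochSettings>,
    configured: EpochSettings,
}

impl LocalProviderSketch {
    pub fn new(stored: HashMap<Epoch, EpochSettings>, configured: EpochSettings) -> Self {
        Self { stored, configured }
    }

    /// Returns the stored settings for `epoch` when they exist,
    /// otherwise the configured values.
    pub fn settings_for(&self, epoch: Epoch) -> EpochSettings {
        self.stored
            .get(&epoch)
            .cloned()
            .unwrap_or_else(|| self.configured.clone())
    }
}
```

An epoch that already has a stored entry keeps returning that entry even if the node configuration changes later, which matches the idea that stored epoch settings are final.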

Follower aggregator specific:

  • Use mithril-protocol-config::http::HttpMithrilNetworkConfigurationProvider as its network configuration provider, fetching data from its configured leader aggregator

Mithril-end-to-end

  • Make the update_protocol_parameters step leader only. It no longer matters for the follower, since the follower now retrieves its configuration from the leader instead of reading its own. Keeping it also introduced flakiness: the follower aggregator sometimes restarted before the leader could restart its HTTP server, which made the follower's handle discrepancies step fail (since it calls the configuration provider).
  • Wrap tailed logs and extracted errors in a ::group:: when running in GitHub Actions. Disabled for now, as there is a remaining issue to tackle first: when the e2e is retried, the logs in those groups for the previous iteration are missing from the action output.
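
For readers unfamiliar with log grouping, the wrapping mentioned above relies on the GitHub Actions workflow commands `::group::{title}` and `::endgroup::`. The sketch below shows one way to emit them only when running inside a workflow; the function names are hypothetical and this is not the code from the end-to-end runner.

```rust
use std::env;

/// GitHub Actions sets the GITHUB_ACTIONS environment variable to "true".
pub fn running_in_github_actions() -> bool {
    env::var("GITHUB_ACTIONS").map(|v| v == "true").unwrap_or(false)
}

/// Formats `lines` as a collapsible group when `in_github_actions` is true,
/// and as plain lines otherwise.
pub fn format_log_group(title: &str, lines: &[&str], in_github_actions: bool) -> String {
    let mut out = String::new();
    if in_github_actions {
        out.push_str(&format!("::group::{}\n", title));
    }
    for line in lines {
        out.push_str(line);
        out.push('\n');
    }
    if in_github_actions {
        out.push_str("::endgroup::\n");
    }
    out
}
```

A caller would typically print `format_log_group(title, &lines, running_in_github_actions())`, so local runs keep their plain output.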

Pre-submit checklist

  • Branch
    • Tests are provided (if possible)
    • Crates versions are updated (if relevant)
    • CHANGELOG file is updated (if relevant)
    • Commit sequence broadly makes sense
    • Key commits have useful messages
  • PR
    • All check jobs of the CI have succeeded
    • Self-reviewed the diff
    • Useful pull request description
    • Reviewer requested
  • Documentation
    • Update README file (if relevant)
    • Update documentation website (if relevant)
    • No new TODOs introduced

Comments

There's a known issue with the updated handle_discrepancies_at_startup: it runs twice.
This is because the ServeCommandDependenciesContainer is also built twice: once for its own purpose, and a second time for the HTTP server.
This is harmless, since the same epoch settings are recorded twice and the second recording is ignored, but we should probably construct an HttpServerDependenciesContainer instead of reusing the one from the serve command.
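
To illustrate why the double run is harmless, here is a minimal sketch of insert-or-ignore semantics applied to the work epoch window (-1, 0, +1). It uses an in-memory map instead of the sqlite store, and the types and function signatures are assumptions for the example; only the names handle_discrepancies_at_startup and the window offsets come from the description.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

pub type Epoch = u64;

/// Simplified, hypothetical stand-in for the per-epoch settings.
#[derive(Debug, Clone, PartialEq)]
pub struct EpochSettings {
    pub k: u64,
    pub m: u64,
}

/// In-memory stand-in for the epoch settings store.
#[derive(Default)]
pub struct EpochSettingsStore {
    rows: BTreeMap<Epoch, EpochSettings>,
}

impl EpochSettingsStore {
    /// Inserts only when no row exists for `epoch`.
    /// Returns true when a row was written, false when it was ignored.
    pub fn insert_or_ignore(&mut self, epoch: Epoch, settings: EpochSettings) -> bool {
        match self.rows.entry(epoch) {
            Entry::Vacant(slot) => {
                slot.insert(settings);
                true
            }
            Entry::Occupied(_) => false,
        }
    }

    pub fn get(&self, epoch: Epoch) -> Option<&EpochSettings> {
        self.rows.get(&epoch)
    }
}

/// Registers settings for the work epoch window and returns how many
/// rows were actually written.
pub fn handle_discrepancies_at_startup(
    store: &mut EpochSettingsStore,
    current_epoch: Epoch,
    settings: &EpochSettings,
) -> usize {
    let window = [
        current_epoch.saturating_sub(1),
        current_epoch,
        current_epoch + 1,
    ];
    let mut written = 0;
    for epoch in window.iter() {
        if store.insert_or_ignore(*epoch, settings.clone()) {
            written += 1;
        }
    }
    written
}
```

Calling the function a second time for the same epoch writes nothing, and existing rows keep their first value even if the second call carries different settings.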

Issue(s)

Relates to #2692

github-actions bot commented Oct 13, 2025

Test Results

    4 files  ± 0    168 suites  ±0   24m 6s ⏱️ +14s
2 212 tests + 5  2 212 ✅ + 5  0 💤 ±0  0 ❌ ±0 
6 897 runs  +10  6 897 ✅ +10  0 💤 ±0  0 ❌ ±0 

Results for commit df320c5. ± Comparison against base commit c7220be.

This pull request removes 3 and adds 8 tests. Note that renamed tests count towards both.

Removed tests:
mithril-aggregator ‑ database::query::epoch_settings::update_epoch_settings::tests::test_update_epoch_settings
mithril-aggregator ‑ runtime::runner::tests::test_update_epoch_settings
mithril-aggregator ‑ services::epoch_service::tests::update_epoch_settings_insert_future_epoch_settings_in_the_store

Added tests:
mithril-aggregator ‑ database::query::epoch_settings::insert_or_ignore_epoch_settings::tests::test_cant_replace_existing_value
mithril-aggregator ‑ database::query::epoch_settings::insert_or_ignore_epoch_settings::tests::test_insert_epoch_setting_in_empty_db
mithril-aggregator ‑ database::repository::epoch_settings_store::tests::save_epoch_settings_does_not_replace_existing_value_in_database
mithril-aggregator ‑ services::epoch_service::tests::inform_epoch_compute_allowed_discriminants_from_intersection_of_aggregation_network_config_and_configured_discriminants
mithril-aggregator ‑ services::epoch_service::tests::inform_epoch_insert_registration_epoch_settings_in_the_store
mithril-aggregator ‑ services::network_configuration_provider::tests::get_stored_configuration_with_stored_value_returns_them
mithril-aggregator ‑ services::network_configuration_provider::tests::get_stored_configuration_without_stored_value_fallback_to_configuration_value
mithril-aggregator ‑ services::network_configuration_provider::tests::test_get_network_configuration_retrieve_configurations_for_aggregation_next_aggregation_and_registration

♻️ This comment has been updated with latest results.

@turmelclem turmelclem force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from 407ea74 to bc2e5d5 Compare October 29, 2025 15:22
@turmelclem turmelclem self-assigned this Oct 29, 2025
@turmelclem turmelclem force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from 120452b to 4e21205 Compare November 3, 2025 16:02
@Alenar Alenar temporarily deployed to testing-preview November 5, 2025 09:01 — with GitHub Actions Inactive
@Alenar Alenar force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from 1b74219 to 890ad0b Compare November 5, 2025 14:58
@Alenar Alenar temporarily deployed to testing-preview November 5, 2025 15:08 — with GitHub Actions Inactive
@Alenar Alenar force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch 4 times, most recently from 07d14a3 to 80abe55 Compare November 6, 2025 11:12
@Alenar Alenar temporarily deployed to testing-preview November 6, 2025 17:20 — with GitHub Actions Inactive
@turmelclem turmelclem force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch 3 times, most recently from ea59e83 to de20459 Compare November 7, 2025 14:16
@Alenar Alenar requested a review from jpraynaud November 10, 2025 10:35
@Alenar Alenar self-assigned this Nov 10, 2025
@Alenar Alenar force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from de20459 to 2097da6 Compare November 10, 2025 11:19
Alenar and others added 3 commits November 10, 2025 13:04
…gning_config optional (mandatory for leader)
Rework `init_state_from_fixture` to not save epoch_settings and to work
with the fixed window of three epochs (aggregate/next aggregate/signer
registration). Epoch settings should already exist; most of the time
they will be inserted by the handle discrepancies system
Alenar and others added 6 commits November 10, 2025 13:04
- run it at the end of the serve dependency container build
- retrieve and save data from the network configuration provider instead
  of the local node configuration
- update follower integration test to check that local protocol
  parameter configuration is not read, instead the configuration is read
  through the network configuration provider from the leader
Since the follower now reads the network config from the leader, the
update of the protocol parameters is now a responsibility
of the leader only.

This led to flakiness because this step was restarting all aggregators,
and sometimes the follower started before the leader and got an error
when it executed its handle discrepancies because the leader HTTP server
was still down.
until we can figure out how to make the log groups work correctly when
the e2e is retried by `nick-fields/retry` (currently the groups before the retry are lost)
@Alenar Alenar force-pushed the ctl/2692-decentralization-of-configuration-parameters-phase-1-local-parameters branch from 2097da6 to 6319f63 Compare November 10, 2025 12:07
@Alenar Alenar marked this pull request as ready for review November 10, 2025 12:08
@Alenar Alenar temporarily deployed to testing-preview November 10, 2025 12:17 — with GitHub Actions Inactive
…ction between local configuration and network configuration

For a leader aggregator this does not change anything right now since both
values come from the aggregator configuration.
For followers this allows them to use a subset of the signed entity types
allowed by their leader.
@Alenar Alenar deployed to testing-preview November 10, 2025 16:08 — with GitHub Actions Active